Discovering Mis-Categorized Entities
Authors
Abstract
Entity categorization, the process of grouping entities into categories, is an important problem with many applications. Unfortunately, in practice, many entities are mis-categorized, for example on Google Scholar pages and among Amazon products. In this paper, we study the problem of discovering mis-categorized entities in a given group of entities. This problem is inherently hard: all entities within the same group have already been “well” categorized by state-of-the-art solutions, so it is nontrivial to differentiate them. We propose a novel rule-based framework to solve this problem. It first uses positive rules to compute disjoint partitions of entities, where the largest partition is taken as the correctly categorized partition, namely the pivot partition. It then uses negative rules to identify mis-categorized entities in the other partitions that are dissimilar to the entities in the pivot partition. We describe optimizations for applying these rules and discuss how to generate positive and negative rules. Extensive experimental results on real-world datasets show the effectiveness of our solution.
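The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the entity schema (`authors`, `venue`), the two rule functions, the 0.5 threshold, and the choice to flag an entity only when the negative rule holds against every pivot entity are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

Entity = Dict[str, Set[str]]            # hypothetical schema: attribute -> set of values
Rule = Callable[[Entity, Entity], bool]


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def positive_rule(e1: Entity, e2: Entity) -> bool:
    # Assumed rule: strong co-author overlap means "same category".
    return jaccard(e1["authors"], e2["authors"]) >= 0.5


def negative_rule(e1: Entity, e2: Entity) -> bool:
    # Assumed rule: no shared author and no shared venue means "dissimilar".
    return not (e1["authors"] & e2["authors"]) and not (e1["venue"] & e2["venue"])


def partition(entities: List[Entity], pos: Rule) -> List[List[int]]:
    """Group entity indices into disjoint partitions using union-find."""
    n = len(entities)
    parent = list(range(n))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if pos(entities[i], entities[j]):
                parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return list(groups.values())


def find_miscategorized(
    entities: List[Entity], pos: Rule, neg: Rule
) -> Tuple[List[int], List[int]]:
    parts = partition(entities, pos)
    pivot = max(parts, key=len)             # largest partition is the pivot
    flagged = []
    for part in parts:
        if part is pivot:
            continue
        for i in part:
            # Assumption: flag only if dissimilar to every pivot entity.
            if all(neg(entities[i], entities[p]) for p in pivot):
                flagged.append(i)
    return sorted(pivot), sorted(flagged)


# Toy data: entities 0-2 share co-authors, 3 shares one author, 4 is unrelated.
entities = [
    {"authors": {"ann", "bob"}, "venue": {"icde"}},
    {"authors": {"ann", "bob", "cy"}, "venue": {"vldb"}},
    {"authors": {"ann", "bob"}, "venue": {"sigmod"}},
    {"authors": {"ann", "dee"}, "venue": {"icde"}},
    {"authors": {"zed"}, "venue": {"chem"}},
]
pivot, flagged = find_miscategorized(entities, positive_rule, negative_rule)
print(pivot, flagged)
```

In this toy run, entity 3 falls outside the pivot partition but is not flagged, because it shares an author with pivot entities; only entity 4 satisfies the negative rule. The pairwise loop is quadratic in the number of entities, which is the kind of cost the rule-application optimizations mentioned in the abstract would presumably address.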
Similar resources
Cleaning Your Wrong Google Scholar Entries
Entity categorization, the process of grouping entities into categories for some specific purpose, is an important problem with a great many applications, such as Google Scholar and Amazon products. Unfortunately, many real-world categories contain mis-categorized entities, such as publications on one’s Google Scholar page that were written by others. We have proposed a general framework...
Discovering Unobserved Heterogeneity in Structural Equation Models to Avert Validity Threats
Decision Support System (DSS) Implementation Success Alavi and Joachimsthaler 1992, MISQ 144 findings from 33 studies Investigating the relationship between user-related factors and DSS implementation success Authors suggest that moderators could explain the large variance in effect sizes across studies. “Reviews of information systems implementation research...have revealed that collectively,...
Jamming with Social Media: How Cognitive Structuring of Organizing Vision Facets Affects IT Innovation Diffusion
We conducted our study in two stages. In Stage I, aimed at discovering organizational actors’ meanings of social media, we focused on category emergence. In Stage II, aimed at developing causal insights around those meanings, we focused on relationships among categories surfaced at Stage I. In Table A1, we map our methods to the methodologies specified by the two main grounded theory proponents...
Understanding User Revisions When Using Information Systems Features: Adaptive System Use and Triggers
Beaudry and Pinsonneault (2005) IT related coping behaviors System users choose different adaptation strategies based on a combination of primary appraisal (i.e., a user’s assessment of the expected consequences of an IT event) and secondary appraisal (i.e., a user’s assessment of his/her control over the situation). Users will perform different actions in response to a combination of cognitive...
Improvement Mechanisms of Management Information System (MIS) In Iran's Agricultural Extension Organization
This research describes the MIS improvement mechanisms in Iran's Agricultural Extension Organization. A survey was used as the research methodology. Data were collected using a structured questionnaire that evaluated managers’ responses regarding MIS improvement mechanisms. All mechanisms had a mean score greater than 5 as perceived by managers, which implied that m...